Traditional Web text clustering algorithms ignore the topic information in Web text, which leads to low accuracy when clustering multi-topic Web text. To address this problem, a new topic-based Web text clustering algorithm was proposed. The method clusters multi-topic Web text in three steps: topic extraction, feature extraction, and text clustering. Unlike traditional Web text clustering algorithms, the proposed method takes the topic information of the Web text fully into account. The experimental results show that the proposed algorithm achieves higher accuracy on multi-topic Web text clustering than text clustering methods based on K-means or HowNet.
To improve the accuracy of Web resource recommendation, a personalized recommendation algorithm based on ontology, named BO-RM, was proposed. Subject extraction and similarity measurement methods were designed, and ontology semantics were used to cluster Web resources. By capturing a user's browsing tracks, the preference tendencies and the recommendations were adjusted dynamically. BO-RM was compared experimentally with a situation-based collaborative filtering algorithm named CFR-RM and with a model-based personalized prediction algorithm. The results show that BO-RM has a relatively stable time overhead and performs well in Mean Reciprocal Rank (MRR) and Mean Average Precision (MAP). The results indicate that BO-RM improves efficiency on large sets of Web resources by analyzing data offline, which makes it practical. In addition, BO-RM captures users' interests in real time and updates the recommendation list dynamically, which meets users' actual needs.
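For readers unfamiliar with the two ranking metrics named above, the following is a minimal sketch of how MRR and MAP are commonly computed over a set of recommendation lists. It uses the standard textbook definitions and is not taken from the BO-RM evaluation itself; the function names and the `(ranked list, relevant set)` input format are assumptions made for illustration.

```python
from typing import Hashable, Sequence, Set, Tuple

# One evaluation run: a ranked recommendation list and the set of items
# the user actually found relevant (hypothetical input format).
Run = Tuple[Sequence[Hashable], Set[Hashable]]


def reciprocal_rank(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Return 1 / rank of the first relevant item, or 0.0 if none appears."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def average_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Average the precision at each rank where a relevant item occurs,
    normalised by the total number of relevant items."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / position
    return precision_sum / len(relevant)


def mean_reciprocal_rank(runs: Sequence[Run]) -> float:
    """Mean of the reciprocal ranks over all runs (0.0 for no runs)."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def mean_average_precision(runs: Sequence[Run]) -> float:
    """Mean of the average precisions over all runs (0.0 for no runs)."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Illustrative data only; these are not results from the paper.
example_runs = [
    (["a", "b", "c"], {"b"}),
    (["x", "y", "z"], {"x", "z"}),
]
mrr = mean_reciprocal_rank(example_runs)
map_score = mean_average_precision(example_runs)
```

Both metrics lie in the range 0 to 1, and higher values mean that relevant resources appear nearer the top of the recommendation list. MRR only rewards the first relevant item, whereas MAP accounts for the positions of all relevant items.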